Examination of Temporal ICD Coding Bias Related to Acute Diseases

Author

  • Mollie McKillop
Abstract

Electronic Health Records (EHRs) hold great promise for secondary data reuse but have been reported to contain severe biases. The temporal characteristics of coding biases remain unclear. This study used a survival analysis approach to reveal temporal bias trends for coding acute diabetic conditions among 268 diabetes patients. For glucose-controlled ketoacidosis patients, we found it took an average of 7.5 months for the incorrect code to be removed, while for glucose-controlled hypoglycemic patients it took an average of 9 months. We also examined blood glucose lab values and performed a case review to confirm the validity of our findings. We discuss the implications of our findings and propose future work.

1.0 Introduction

Administrative data, namely International Classification of Diseases (ICD) codes, have been widely used to identify disease-specific cohorts of patients. Such codes are often used in clinical and health services research because they are easy to obtain, have low associated costs, and can be aggregated to form large study samples. The accuracy of study results derived from administrative data depends on how well a particular coding scheme describes the disease cohort of interest. Documenting coding biases is especially important as EHRs become more widely adopted and relied upon for large-scale data reuse. Increasingly, EHRs are used to document macroscopic human conditions, or phenotypes, automatically feeding data for secondary reuse purposes such as clinical research, quality improvement, and public health initiatives. Such uses require high-quality data, which are often lacking in the EHR. Coding bias is important to document and characterize for diabetes, which is an increasingly prevalent disease and a major source of morbidity and mortality. It is a leading cause of blindness, end-stage renal disease, and cardiovascular disease, and it is associated with high healthcare costs.
Coding bias related to complications of diabetes, specifically ketoacidosis and hypoglycemia, is particularly important to understand because these are acute, life-threatening conditions that require hospitalization. Studies that use the EHR to phenotype these patients must account for any coding bias related to these conditions if their results are to be accurate. The validity of ICD codes for identifying patient groups has been challenged many times before and for a variety of conditions. These studies have shown that ICD codes are biased because concept definitions for codes are incomplete or unsatisfactory in granularity. Moreover, variability in coding behavior also leads to incorrect code assignment. Researchers have previously questioned the validity and generalizability of ICD codes as applied to diabetes. However, these studies have not examined coding bias among complications of diabetes and uncontrolled glucose levels. To date, we could find only two studies that examined coding bias associated with ketoacidosis. They are small in scale and focus on data captured before 2009. We seek to build upon this work by validating these results and examining bias among a larger, more diverse, more recently treated population of patients. We also seek to report coding bias related to hypoglycemia, which to our knowledge has not previously been documented. Recognizing that human phenotypes are time-dependent, we aim to describe the temporal bias associated with ketoacidosis and hypoglycemia ICD-9 codes. Previous research has neglected the dynamic nature of such phenotypes when examining coding bias. Temporal bias is important to understand for accurate phenotyping and the full realization of purported EHR benefits. This study uses acute complications of diabetes and uncontrolled glucose as examples to investigate the temporality of ICD coding bias.
Specifically, we ask the question, "Do patients who are initially coded for ketoacidosis or hypoglycemia remain coded as such despite controlling their glucose levels?" In other words, we hypothesize that patients who initially receive an ICD-9 code assignment for either ketoacidosis or hypoglycemia continue to be assigned these codes despite little clinical evidence that such code assignment is reasonable. We examine the extent of this bias over time by using survival analysis to determine the time it takes, on average, for patients to have the incorrect code removed from their personal EHR. We also examine this bias for different disease subgroups (Type 1 versus Type 2) and among glucose-controlled and uncontrolled patients. Finally, we hope to describe our method in enough detail to allow other researchers to replicate our findings and test the strength of our conclusions by examining temporal coding bias among other acute conditions. This study was performed in compliance with the World Medical Association Declaration of Helsinki on Ethical Principles for Medical Research Involving Human Subjects and was approved by the Columbia University Medical Center Institutional Review Board.

2.0 Methods

2.1 Data description and processing

For our study we utilized the Columbia University Medical Center Clinical Data Warehouse (CDW), which contains 24 years of data on 4.5 million patients. Data were extracted in a multi-step process, depicted in Figure 1, to generate our cohort for analysis. First, relevant ICD-9 codes were identified using the Centers for Medicare & Medicaid Services (CMS) ICD-9 Code Lookup. The relevant codes are listed in Table 1. All instances of these codes from 2004 to October 2014 were extracted, along with corresponding fake patient-identification numbers and real timestamps for each code. This resulted in 4844 unique patients, 2242 with ketoacidosis and 2602 with hypoglycemia.
Along with the ICD-9 codes, all HgA1c lab values for these patients, together with fake patient IDs and real timestamps, were extracted from the CDW. Lab values occurring after the initial coding of ketoacidosis or hypoglycemia were selected. Patients were then separated by ketoacidosis and hypoglycemia. After selecting only the lab values occurring after the initial coding for ketoacidosis, 1309 patients were retained; after selecting only the lab values occurring after the initial coding for hypoglycemia, 1129 patients remained. In the ketoacidosis subgroup, 112 patients were selected for analysis based on having controlled glucose levels, defined as a median HgA1c lab value less than 7, a threshold recommended by the American Diabetes Association. In the hypoglycemia group, 156 patients were selected for analysis based on having a similarly defined controlled glucose level.

Table 1. ICD-9-CM codes for ketoacidosis and hypoglycemia. Definitions are from CMS' ICD-9 Code Lookup.

ICD-9 Code   Description
250.10       DIABETES WITH KETOACIDOSIS, TYPE II OR UNSPECIFIED TYPE, NOT STATED AS UNCONTROLLED
250.11       DIABETES WITH KETOACIDOSIS, TYPE I [JUVENILE TYPE], NOT STATED AS UNCONTROLLED
250.12       DIABETES WITH KETOACIDOSIS, TYPE II OR UNSPECIFIED TYPE, UNCONTROLLED
250.13       DIABETES WITH KETOACIDOSIS, TYPE I [JUVENILE TYPE], UNCONTROLLED
251.10       OTHER SPECIFIED HYPOGLYCEMIA
251.20       HYPOGLYCEMIA UNSPECIFIED

2.2 Data analysis

Survival analysis was performed separately on the remaining 112 ketoacidosis and 156 hypoglycemia observations to determine the probability of remaining coded for ketoacidosis or hypoglycemia despite having controlled glucose levels. Survival analysis was chosen as the statistical method because it allows the time to an event to be computed. In this case, the event was decoding of a glucose-controlled patient.
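The glucose-control filter and the time-to-decoding estimate described above can be illustrated with a short sketch. The authors ran their analysis in R, so this is not their code; it is a minimal Python illustration of a Kaplan-Meier estimator under stated assumptions. The function names, the example durations, and the censoring flags are all hypothetical, and how the authors defined the decoding date and censoring is not specified in the text.

```python
from statistics import median

# Median HgA1c below this value counts as "controlled" (threshold from Section 2.1).
A1C_CONTROL_THRESHOLD = 7.0


def is_glucose_controlled(a1c_values):
    """True if the median of a patient's post-coding HgA1c values is below 7."""
    return median(a1c_values) < A1C_CONTROL_THRESHOLD


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each time with an observed event.

    durations: follow-up time per patient (assumed here: months until the
               last ketoacidosis or hypoglycemia code, in 3-month steps).
    events:    True if decoding was observed, False if the patient was
               censored (assumption: still coded when follow-up ended).
    """
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        decoded = sum(1 for d, e in zip(durations, events) if d == t and e)
        leaving = sum(1 for d in durations if d == t)
        if decoded:
            # Product-limit step: multiply by the fraction still coded at t.
            survival *= 1.0 - decoded / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def median_survival_time(curve):
    """First time at which the survival probability drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical example data, not taken from the study.
example_durations = [3, 3, 6, 6, 9, 9, 12, 15]
example_events = [True, True, True, False, True, True, False, True]
example_curve = kaplan_meier(example_durations, example_events)
print(example_curve)
print(median_survival_time(example_curve))
```

The median survival time returned here corresponds to the point at which 50% of patients are no longer assigned the disease code, which is the quantity the results sections read off the survival curves.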
For the ketoacidosis group, the granularity of the ICD-9 codes allowed us to further explore any coding bias among disease subgroups. First, individuals who were identified as Type 1 or Type 2 ketoacidosis patients by ICD-9 codes had their time to decoding compared and assessed for statistical significance using a Cox proportional hazards test. Second, individuals who were identified as uncontrolled or controlled ketoacidosis patients by ICD-9 codes had their time to decoding compared and assessed for statistical significance, again using a Cox proportional hazards test.

Figure 1. Cohort selection for analysis.

Because HgA1c lab tests measure a patient's average glucose level over several weeks, we looked at the blood glucose lab tests 14 days before any ICD-9 code timestamp after the first ICD-9 code assignment. This time frame was chosen based on administrative timetables, since reimbursement claims are generally submitted within two weeks of any services or procedures provided. Also, the ICD-9 timestamp represents the discharge date. Considering the acute nature of the diseases under study, we felt this was an adequate time frame from which to select blood glucose lab results. The blood glucose test was selected because it provides a way to characterize patients as being in a state of either hypoglycemia or ketoacidosis. According to UpToDate, a blood glucose serum level over 350 mg/dL can be used to diagnose ketoacidosis, while a blood glucose serum level under 40 mg/dL can be used to diagnose hypoglycemia. The blood glucose lab values were extracted from the CDW using the Medical Entities Dictionary, which is a large repository of medical concepts drawn from a variety of sources either developed or used at NYP. For each patient, the mean and median percentage of times these blood glucose lab values indicated disease for either the ketoacidosis or the hypoglycemia group was calculated. The analysis was performed using RStudio version 0.98.1091.
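The 14-day window and the glucose cutoffs described above can be sketched as follows. This is a hedged Python illustration rather than the authors' R code. The function names and the sample records are hypothetical, and treating both ends of the 14-day window as inclusive is an assumption, since the text does not say how boundary cases were handled.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(days=14)   # look-back period before each ICD-9 timestamp
KETOACIDOSIS_CUTOFF = 350     # mg/dL; values above this indicate ketoacidosis
HYPOGLYCEMIA_CUTOFF = 40      # mg/dL; values below this indicate hypoglycemia


def labs_in_window(labs, code_times):
    """Keep labs drawn within 14 days before any code timestamp except the first.

    labs:       list of (timestamp, glucose_value) tuples
    code_times: list of ICD-9 code timestamps for one patient
    """
    later_codes = sorted(code_times)[1:]  # skip the initial coding
    return [
        (ts, value)
        for ts, value in labs
        if any(code - WINDOW <= ts <= code for code in later_codes)
    ]


def percent_positive(values, condition):
    """Percentage of glucose values that indicate the given condition.

    Returns None when a patient has no labs in the window.
    """
    if not values:
        return None
    if condition == "ketoacidosis":
        hits = sum(1 for v in values if v > KETOACIDOSIS_CUTOFF)
    elif condition == "hypoglycemia":
        hits = sum(1 for v in values if v < HYPOGLYCEMIA_CUTOFF)
    else:
        raise ValueError(f"unknown condition: {condition}")
    return 100 * hits / len(values)


# Hypothetical patient record, not taken from the study.
codes = [datetime(2014, 1, 1), datetime(2014, 2, 1), datetime(2014, 3, 1)]
labs = [
    (datetime(2013, 12, 25), 450),  # only precedes the first code: excluded
    (datetime(2014, 1, 10), 500),   # more than 14 days before Feb 1: excluded
    (datetime(2014, 1, 25), 400),   # within 14 days before Feb 1: kept
    (datetime(2014, 2, 20), 120),   # within 14 days before Mar 1: kept
]
selected = [value for _, value in labs_in_window(labs, codes)]
print(percent_positive(selected, "ketoacidosis"))
```

Per-patient percentages computed this way could then be summarized across the cohort with a mean and a median, which is the summary the results sections report.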
The datasets and R code are available for public use by contacting the primary author (MM).

Finally, a case review of deceased patient notes for both the ketoacidosis group and the hypoglycemia group was performed by a physician (FP) to further examine the validity of our results and perform an error analysis. Since patients were selected from groups dependent upon ICD-9 code assignment, the clinical reviewer provided a comprehensive chart review of the deceased patient notes from both the ketoacidosis and hypoglycemia groups. The main purpose of this review was to contribute to an error analysis and provide a definitive "gold standard" of whether the patient had ketoacidosis or hypoglycemia. The Annals of Emergency Medicine guidelines for chart review were followed as much as possible. The participating reviewer (FP) is a clinician and knew the purpose of the study but not the study's full outcome. The variables to be collected from the chart, as well as how these variables are defined, were determined a priori and documented for the coder. Variables were selected using UpToDate guidelines to construct coding rules. Once the reviewer determined a positive diagnosis based on the coded variables, no further case review for that patient was performed. Overall, the reviewer examined 5 ketoacidosis patients and 13 hypoglycemic patients.

3.0 Results

3.1 Descriptive statistics

In total there were 112 patients in the ketoacidosis group and 156 patients in the hypoglycemia group. All patients in the ketoacidosis group also appeared in the hypoglycemia group. In the ketoacidosis group, there was a median of 5 HgA1c lab values per patient, with a minimum of 1 and a maximum of 40. In the hypoglycemia group, there was a median of 10 HgA1c lab values per patient, with a minimum of 2 and a maximum of 203. The trend in HgA1c lab values over time by patient for the ketoacidosis group (left) and hypoglycemia group (right) is shown in Figure 2.

Figure 2. HgA1c linear trend for ketoacidosis patients (left) and hypoglycemia patients (right). Each line represents a unique patient. The horizontal axis is the sequential order of HgA1c lab tests; the vertical axis is the HgA1c lab value.

A generalized estimating equation, accounting for unequally spaced observations, was fit for the ketoacidosis and hypoglycemia groups separately to test for significant linear trend associations between HgA1c and time. The hypoglycemia group was found to have a significant association (p-value = 0.0045), while the ketoacidosis group did not (p-value = 0.11). Both models had a negative coefficient.

In the ketoacidosis group, there was a median of 15.5 ICD-9 code assignments per patient, with a minimum of 1 and a maximum of 211. In the hypoglycemia group, there was a median of 10 ICD-9 code assignments per patient, with a minimum of 2 and a maximum of 203. When the ketoacidosis group was split by diabetes type as determined by ICD-9 codes, there were 39 Type 1 patients and 73 Type 2 patients. When it was split by uncontrolled versus controlled diabetes as determined by ICD-9 codes, there were 53 'not stated as uncontrolled' patients and 59 'uncontrolled' patients.

3.2 Survival analysis for the ketoacidosis group

The Kaplan-Meier survival curve for the 112 observations is shown in Figure 3. Time was recorded in 3-month intervals to reflect national guidelines for diabetes screening. According to the survival curve, it takes approximately 7.5 months for 50% of patients who were initially coded for ketoacidosis and have their glucose in control to stop being assigned any of the four ketoacidosis ICD-9 codes listed in Table 1.

Figure 3. Kaplan-Meier survival curve for all ketoacidosis patients. Time is in 3-month intervals. Dashed lines represent confidence intervals for the survival curve.

A Cox proportional hazards test was performed to determine if significant differences existed between any of the four ICD-9 codes.
This test was found to be significant at a level of 0.01 (p-value = 0.00538), indicating that at least one of the ICD-9 code survival curves is different from the others. This difference was examined further by separating the population into groups based on diabetes type and on 'uncontrolled' versus 'not stated as uncontrolled' ICD-9 code assignment. The survival curves for Type 1 and Type 2 diabetes are displayed in Figure 4. These curves were not significantly different from each other at a level of 0.05 (p-value = 0.316) using the Cox proportional hazards test.

Figure 4. Kaplan-Meier survival curve for ketoacidosis patients by diabetes type. Time is in 3-month intervals. The blue curve represents Type 1 diabetics and the green curve represents Type 2 diabetics.

A third Cox proportional hazards test found a significant difference (p-value = 0.00821) at a level of 0.01 between the Kaplan-Meier survival curves of patients coded for 'uncontrolled' (ICD-9 codes 250.12 and 250.13) and patients coded for 'not stated as uncontrolled' (ICD-9 codes 250.10 and 250.11). The survival curves are displayed in Figure 5 and indicate that patients coded for 'not stated as uncontrolled' ketoacidosis (blue line) are decoded from any ketoacidosis ICD-9 code 25.1% faster than patients coded for 'uncontrolled' ketoacidosis (green line).

Figure 5. Kaplan-Meier survival curve for ketoacidosis patients by ICD-9 code assignment for 'Not Stated as Uncontrolled' (ICD-9 codes 250.10 and 250.11) vs. 'Uncontrolled' (ICD-9 codes 250.12 and 250.13). Time is in 3-month intervals. The blue curve is for 'Not Stated as Uncontrolled' diabetics and the green curve is for 'Uncontrolled' diabetics.

3.3 Glucose lab value assessment for the ketoacidosis group

Patients were selected from those who had one or more ICD-9 code assignments for ketoacidosis after the initial coding for the disease.
The glucose lab tests occurring 14 days before any ICD-9 code timestamp, except the first timestamp, were selected and assessed for positive indication of ketoacidosis based on a blood glucose cutoff level of 350 mg/dL as determined by UpToDate. Only 90 of the 112 patients had blood glucose labs occurring 14 days before the ICD code timestamps under consideration. Positive disease indications were assigned a value of one and tabulated by patient. The mean number of positive disease indications was 210 with a median of 195, while the median percentage of positive disease indications was 6.33% of all blood glucose lab tests per patient. The mean percentage of positive disease indications over the number of blood glucose labs was calculated per patient, and the distribution is shown in Figure 6.

Figure 6. Distribution of mean positive ketoacidosis indication by sequential ICD code assignments.

3.4 Survival analysis for the hypoglycemia group

The Kaplan-Meier survival curve for the 156 hypoglycemia patients is shown in Figure 7. Time was recorded in 3-month intervals to reflect national guidelines for diabetes screening. According to the survival curve, it takes 9 months for 50% of patients who were initially coded for hypoglycemia and have their glucose in control to stop being assigned any of the ICD-9 codes for hypoglycemia. Unfortunately, no further survival analysis of hypoglycemia subgroups could be performed due to the limited granularity of ICD-9 codes in this category.

Figure 7. Kaplan-Meier survival curve for all hypoglycemia patients. Time is in 3-month intervals. Dashed lines represent confidence intervals for the survival curve.

3.5 Glucose lab value analysis for the hypoglycemia group

Patients were selected from those who had one or more ICD-9 code assignments for hypoglycemia after the initial coding for the disease.
The glucose lab tests occurring 14 days before any ICD-9 code timestamp, except the first timestamp, were selected and assessed for positive indication of hypoglycemia based on a blood glucose cutoff level of 40 mg/dL as determined by UpToDate. Only 33 of the 156 patients had blood glucose labs occurring 14 days before the ICD code timestamps under consideration. Positive indications were assigned a value of one and tabulated by patient. The mean number of positive disease indications was 120 with a median of 109, while the mean percentage was 0.8977% and the median percentage was 0%. The percentage of positive disease indications over the number of blood glucose labs was calculated per patient, and the distribution is shown in Figure 8.

Figure 8. Distribution of mean positive hypoglycemia disease indication by sequential ICD code assignments.

Publication date: 2015